Confidence Intervals and Hypothesis Testing for High-Dimensional Statistical Models

Authors

  • Adel Javanmard
  • Andrea Montanari
Abstract

Fitting high-dimensional statistical models often requires the use of non-linear parameter estimation procedures. As a consequence, it is generally impossible to obtain an exact characterization of the probability distribution of the parameter estimates. This in turn implies that it is extremely challenging to quantify the uncertainty associated with a given parameter estimate. Concretely, no commonly accepted procedure exists for computing classical measures of uncertainty and statistical significance such as confidence intervals or p-values. We consider here a broad class of regression problems, and propose an efficient algorithm for constructing confidence intervals and p-values. The resulting confidence intervals have nearly optimal size. When testing the null hypothesis that a certain parameter vanishes, our method has nearly optimal power. Our approach is based on constructing a ‘de-biased’ version of regularized M-estimators. The new construction improves over recent work in the field in that it does not assume a special structure on the design matrix. Furthermore, the proofs are remarkably simple. We test our method on a diabetes prediction problem.
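
As an illustration only, consider the linear model $y = X\theta_0 + w$ with $n$ samples. A de-biased estimator of the kind the abstract describes is commonly written as a one-step correction of a regularized estimate $\hat\theta$ (for example, the Lasso). The symbols below ($M$, $\hat\Sigma$, $\hat\sigma$, $\Phi$) are assumed notation, not taken from the abstract, and the choice of the matrix $M$ (an approximate inverse of the sample covariance) is left unspecified in this sketch.

```latex
% Sketch under assumed notation: \hat\theta is a regularized estimate,
% M is an approximate inverse of \hat\Sigma, \hat\sigma estimates the noise level,
% and \Phi is the standard normal CDF.
\begin{align*}
  \hat\Sigma &= \frac{1}{n} X^{\top} X, \\
  \hat\theta^{u} &= \hat\theta + \frac{1}{n}\, M X^{\top}\bigl(y - X\hat\theta\bigr), \\
  \text{CI}_i(\alpha) &= \Bigl[\hat\theta^{u}_i - \delta_i,\; \hat\theta^{u}_i + \delta_i\Bigr],
  \qquad
  \delta_i = \Phi^{-1}\!\bigl(1 - \tfrac{\alpha}{2}\bigr)\,
             \frac{\hat\sigma}{\sqrt{n}}\,
             \bigl[M \hat\Sigma M^{\top}\bigr]_{ii}^{1/2}, \\
  P_i &= 2\left(1 - \Phi\!\left(
         \frac{\sqrt{n}\,\bigl|\hat\theta^{u}_i\bigr|}
              {\hat\sigma\,\bigl[M \hat\Sigma M^{\top}\bigr]_{ii}^{1/2}}
         \right)\right).
\end{align*}
```

As a sanity check on the form of the correction: when $n > p$ and $X^{\top}X$ is invertible, taking $M = \hat\Sigma^{-1}$ gives $\hat\theta^{u} = (X^{\top}X)^{-1}X^{\top}y$, the ordinary least-squares estimate, regardless of $\hat\theta$.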

Similar resources

Confidence Intervals for the Power of Two-Sided Student’s t-test

For the power of a two-sided hypothesis test about the mean of a normal population, we derive a 100(1 − alpha)% confidence interval. Then, using a numerical method, we find the shortest confidence interval and consider some special cases.

Confidence intervals and hypothesis testing for high-dimensional regression

Fitting high-dimensional statistical models often requires the use of non-linear parameter estimation procedures. As a consequence, it is generally impossible to obtain an exact characterization of the probability distribution of the parameter estimates. This in turn implies that it is extremely challenging to quantify the uncertainty associated with a certain parameter estimate. Concretely, no...

Joint Confidence Regions

Confidence intervals are one of the most important topics in mathematical statistics and are closely related to statistical hypothesis tests. The aim of a confidence interval is to find a random interval that covers the unknown parameter with high probability. Confidence intervals and their different forms have been extensively discussed in standard statistical books. Since the most of stati...

Measuring Hospital Performance Using Mortality Rates: An Alternative to the RAMR

Background: The risk-adjusted mortality rate (RAMR) is used widely by healthcare agencies to evaluate hospital performance. The RAMR is insensitive to case volume and requires a confidence interval for proper interpretation, which results in a hypothesis testing framework. Unfamiliarity with hypothesis testing can lead to erroneous interpretations by the public and other stakeholders. We argue t...

LINEAR HYPOTHESIS TESTING USING DLR METRIC

Several practical hypothesis-testing problems can be examined within a general linear model using analysis of variance. In analysis of variance, when the response random variable Y has a linear relationship with several random variables X, another important model, analysis of covariance, can be used. In this paper, assuming that Y is fuzzy and using DLR metric, a method for testing ...

Publication date: 2013